Quantitative Biology
Wiley
All preprints, ranked by how well they match Quantitative Biology's content profile, based on 11 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Shao, N.; Pan, H.; Li, X.; Li, W.; Wang, S.; Xuan, Y.; Yan, Y.; Yu, J.; Liu, K.; Chen, Y.; Xu, B.; Luo, X.; Shen, C. Y.; Zhong, M.; Xu, X.; Chen, X.; Lu, S.; Ding, G.; Cheng, J.; Chen, W.
COVID-19 has been critically and continuously affecting the whole world since December 2019. We have independently developed a novel statistical time-delay dynamic model on the basis of the distribution models from the CCDC. Based only on the numbers of confirmed cases in different regions in China, the model clearly reveals that containment of the epidemic depends heavily on early and effective isolation. We apply the model to the epidemic in Japan and conclude that there could be a rapid outbreak in Japan if no effective quarantine measures are carried out immediately.
Peng, L.; Yang, W.; Zhang, D.; Zhuge, C.; Hong, L.
The outbreak of novel coronavirus-caused pneumonia (COVID-19) in Wuhan has attracted worldwide attention. Here, we propose a generalized SEIR model to analyze this epidemic. Based on the public data of the National Health Commission of China from Jan. 20th to Feb. 9th, 2020, we reliably estimate key epidemic parameters and make predictions on the inflection point and possible ending time for 5 different regions. According to an optimistic estimate, the epidemics in Beijing and Shanghai will end within two weeks, while for most of China, including the majority of cities in Hubei province, the epidemic will be brought under control no later than the middle of March. The situation in Wuhan is still very severe, at least based on public data until Feb. 15th. We expect it will end at the beginning of April. Moreover, by inverse inference, we find that the outbreaks of COVID-19 in Mainland China, Hubei province and Wuhan can all be dated back to the end of December 2019, and the doubling time is around two days at the early stage.
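As a rough illustration of the kind of compartmental model this abstract refers to, the sketch below integrates a basic SEIR model with forward Euler steps and converts an exponential growth rate into a doubling time. It is not the authors' generalized SEIR model: the function names, the Euler scheme, and every parameter value are assumptions chosen only for illustration.

```python
import math


def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate a basic SEIR model with forward Euler steps.

    beta: transmission rate, sigma: 1 / latent period,
    gamma: 1 / infectious period (all per day, hypothetical values).
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    steps_per_day = int(round(1 / dt))
    trajectory = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory


def doubling_time(growth_rate):
    """Doubling time of exponential growth with the given per-day rate."""
    return math.log(2) / growth_rate


# Illustrative run with made-up parameters (not fitted to any data).
trajectory = simulate_seir(beta=0.8, sigma=0.2, gamma=0.1,
                           s0=9990, e0=0, i0=10, r0=0, days=30)
print(trajectory[-1])
```

Under this relation, a doubling time of about two days corresponds to an early growth rate of roughly ln(2)/2, or about 0.35 per day.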
Wang, H. M.; Lou, J.; Cao, L.; Zhao, S.; Chan, P. K.; Chan, M. C.-W.; Chong, M. K.; Wu, W. K.; Chan, R. W.; Wei, Y.; Zhang, H.; Zee, B. C.; Yeoh, E.-k.
Virus evolution drives the annual influenza epidemics in human populations worldwide. However, it has been challenging to evaluate the effect of influenza virus mutations on evading population immunity. In this study, we introduce a novel statistical and computational approach to measure the dynamic molecular determinants underlying epidemics by the effective mutations (EMs), and account for the time of waning mutation advantage against herd immunity by the effective mutation periods (EMPs). Extensive analysis is performed on the genome and epidemiology data of 13-year worldwide H3N2 epidemics involving nine regions in four continents. We showed that the identified EMs possessed similar profiles in geographically adjacent regions, while only 40% are common to Europe, North America, Asia and Oceania, indicating that region-specific mutations also contributed significantly to the global H3N2 epidemics. The mutation dynamics indicated that around 90% of the common EMs underlying global epidemics originated from South East Asia, led by Thailand and India, and the rest originated from North America. New Zealand was found to be the dominant sink region of H3N2 circulation, followed by the UK. All regions might act as intersections in the H3N2 transmission network. The proposed methodology provides a way to characterize key amino acids from the genetic epidemiology point of view. This approach is not restricted by the genomic region or type of the virus, and will find broad applications in identifying therapeutic targets for combating infectious diseases.
Zhang, Y.; Jiang, B.; Yuan, J.; Tao, Y.
The outbreak of coronavirus disease 2019 (COVID-19), which originated in Wuhan, China, constitutes a public health emergency of international concern with a very high risk of spread and impact at the global level. We developed data-driven susceptible-exposed-infectious-quarantine-recovered (SEIQR) models to simulate the epidemic with the interventions of social distancing and epicenter lockdown. Population migration data combined with officially reported data were used to estimate model parameters, and we then calculated the daily exported infected individuals by estimating the daily infected ratio and daily susceptible population size. As of Jan 01, 2020, the estimated initial number of latently infected individuals was 380.1 (95%-CI: 379.8–381.0). With 30 days of substantial social distancing, the reproductive number in Wuhan and Hubei was reduced from 2.2 (95%-CI: 1.4–3.9) to 1.58 (95%-CI: 1.34–2.07), and in other provinces from 2.56 (95%-CI: 2.43–2.63) to 1.65 (95%-CI: 1.56–1.76). We found that earlier intervention of social distancing could significantly limit the epidemic in mainland China. The number of infections could be reduced by up to 98.9%, and the number of deaths could be reduced by up to 99.3% as of Feb 23, 2020. However, earlier epicenter lockdown would partially neutralize this favorable effect, because it would cause in situ deterioration that outweighs the improvement outside the epicenter. To minimize the epidemic size and deaths, stepwise implementation of social distancing in the epicenter city first, then in the province, and later the whole nation without the epicenter lockdown would be practical and cost-effective.
Gu, X.
Current cancer genomics databases have accumulated millions of somatic mutations that remain to be further explored, facilitating large-scale high-throughput analyses to explore the underlying mechanisms that may contribute to malignant initiation or progression. In the context of over-dominant passenger mutations (unrelated to cancers), the challenge is to identify somatic mutations that are cancer-driving. Under the notion that carcinogenesis is a form of somatic-cell evolution, we developed a two-component mixture model that enables the following analyses. (i) We formulated a quasi-likelihood approach to test whether the two-component model is significantly better than a single-component model, which can be used for predicting new cancer genes. (ii) We implemented an empirical Bayesian method to calculate the posterior probabilities of a site being cancer-driving for all sites of a gene, which can be used for predicting new driver sites. (iii) We developed a computational procedure to calculate the somatic selection intensity at driver sites and passenger sites, respectively, as well as site-specific profiles for all sites. Using these newly developed methods, we comprehensively analyzed 294 known cancer genes based on The Cancer Genome Atlas (TCGA) database.
Sun, N.; Yu, H.; Ren, R.; Zhou, T.; Guan, M.; Zhao, L.; Yau, S. S.-T.
Understanding the differences between genomic sequences of different life forms is crucial for biological classification and phylogeny. Here, we downloaded all the reliable sequences of the seven kingdoms and determined the dimensions of the genome space embedded in the Euclidean space, along with the corresponding Natural Metrics. The concept of the Grand Biological Universe is further proposed. In the grand universe, the convex hulls formed by the universes of seven kingdoms are mutually disjoint, and the convex hulls formed by different biological groups within each kingdom are mutually disjoint. This study provides a novel geometric perspective for studying molecular biology and also offers an accurate way for large-scale sequence comparison in a real-time manner. Most importantly, this study shows that, due to the space-time distortion in the biological genome space similar to Einstein's theory, it is futile to look for a single metric to measure different biological universes, as previous studies have done.
Yuan, H.-Y.; Mao, A.; Han, G.; Yuan, H.; Pfeiffer, D.
The rapid expansion of COVID-19 has caused a global pandemic. Although quarantine measures have been used widely, the critical steps among them to suppress the outbreak without a huge socio-economic loss remain unknown. Hong Kong, unlike other regions in the world, had a massive number of travellers from Mainland China during the early expansion period, and yet the spread of the virus has been relatively limited. Understanding the effect of control measures in reducing transmission in Hong Kong can improve the control of the spread of the virus. We have developed a susceptible-exposed-infectious-quarantined-recovered (SEIQR) meta-population model that can stratify the infections into imported and subsequent local infections, and thereby obtain the control effects on transmissibility in a region with many imported cases. We fitted the model to both imported and local confirmed cases with symptom onset from 18 January to 29 February 2020 in Hong Kong with daily transportation data and the transmission dynamics from Wuhan and Mainland China. The model estimated that the reproductive number dropped from 2.32 to 0.76 (95% CI, 0.66 to 0.86) after an infected case was estimated to be quarantined half a day before symptom onset, corresponding to an incubation time of 5.43 days (95% CI, 1.30-9.47). If the quarantine happened about one day after onset, community spread would be likely to occur, as indicated by a reproductive number larger than one. The results suggest that early quarantine of a suspected case before symptom onset is a key factor in suppressing COVID-19.
Xu, Z.; Zhang, H.; Niu, Y.
It is of great theoretical and practical value to accurately forecast the spreading dynamics of the COVID-19 epidemic. We first proposed and established a Bayesian model to predict the epidemic spreading behavior. In this model, the infection probability matrix is estimated according to the individual contact frequency in a certain population group. This infection probability matrix is highly correlated with population geographic distribution, population age structure and so on. This model can effectively avoid the prediction failures that arise from using traditional ordinary differential equation methods such as the SIR (susceptible, infectious and recovered) model. Meanwhile, it can forecast the epidemic distribution and predict the epidemic hot spots geographically at different times. According to the results revealed by the Bayesian model, the effect of population geographical distribution should be considered in the prediction of the epidemic situation, and there is no simple derivation relationship between the threshold of group immunity and the virus reproduction number R0. If we further consider the virus mutation effect and the antibody attenuation effect, with a large global population spatial distribution, it will be difficult for us to eliminate COVID-19 in a short time even with vaccination efforts. COVID-19 may exist in human society for a long time, and the epidemic caused by re-infection is characterized by a wild-geometric and low-probability distribution with no epidemic hotspots.
DENG, Q.
The mainstream compartmental models require stochastic parameterization to estimate the transmission parameters between compartments, which depends upon detailed statistics on epidemiological characteristics that are economically and resource-wise expensive to collect. As an alternative, deep learning techniques are effective in estimating these stochastic parameters with greatly reduced dependency on data particularity. We apply deep learning to estimate transmission parameters of a customized compartmental model, then feed the estimated transmission parameters to the compartmental model to predict the development of the Omicron epidemics in China for 28 days. The average levels of prediction accuracy of the model are 98% and 92% for the number of infections and deaths, respectively.
Su, Z.; Almo, S.; Wu, Y.
The use of bispecific antibodies as T cell engagers can bypass the normal TCR-MHC interaction, redirect the cytotoxic activity of T-cells, and lead to highly efficient tumor cell killing. However, this immunotherapy also causes significant on-target off-tumor toxicologic effects, especially when it is used to treat solid tumors. In order to avoid these adverse events, it is necessary to understand the fundamental mechanisms during the physical process of T cell engagement. We developed a multiscale computational framework to reach this goal. The framework combines simulations on the intercellular and multicellular levels. On the intercellular level, we simulated the spatial-temporal dynamics of three-body interactions among bispecific antibodies, CD3 and TAA. The derived number of intercellular bonds formed between CD3 and TAA was further transferred into the multicellular simulations as the input parameter of adhesive density between cells. Through the simulations under various molecular and cellular conditions, we were able to gain new insights into how to adopt the most appropriate strategy to maximize the drug efficacy and avoid the off-target effect. For instance, we discovered that low antibody binding affinity resulted in the formation of large clusters at the cell-cell interface, which could be important to control the downstream signaling pathways. We also tested different molecular architectures of the bispecific antibody and suggested the existence of an optimal length in regulating the T cell engagement. Overall, the current multiscale simulations serve as a proof-of-concept study to help the future design of new biological therapeutics.
SIGNIFICANCE: T-cell engagers are a class of anti-cancer drugs that can directly kill tumor cells by bringing T cells next to them. However, current treatments using T-cell engagers can cause serious side-effects. In order to reduce these effects, it is necessary to understand how T cells and tumor cells interact through the connection of T-cell engagers. Unfortunately, this process is not well studied due to the limitations in current experimental techniques. We developed computational models on two different scales to simulate the physical process of T cell engagement. Our simulation results provide new insights into the general properties of T cell engagers. The new simulation methods can therefore serve as a useful tool to design novel antibodies for cancer immunotherapy.
Chu, X.; Wang, J.
Cell state transitions or cell fate decision making processes, such as cell development and cell pathological transformation, are believed to be determined by the regulatory network of genes, which intimately depend on the structures of chromosomes in the cell nucleus. The high temporal resolution picture of how chromosome reorganizes its 3D structure during the cell state transitions is the key to understanding the mechanisms of these fundamental cellular processes. However, this picture is still challenging to acquire at present. Here, we studied the chromosome structural dynamics during the cell state transitions among the pluripotent embryonic stem cell (ESC), the terminally differentiated normal cell and the cancer cell using landscape-switching model implemented in the molecular dynamics simulation. We considered up to 6 transitions, including differentiation, reprogramming, cancer formation and reversion. We found that the pathways can merge at certain stages during the transitions for the two processes having the same destination as the ESC or the normal cell. Before reaching the merging point, the two pathways are cell-type-specific. The chromosomes at the merging points show high structural similarity to the ones at the final cell states in terms of the contact maps, TADs and compartments. The post-merging processes correspond to the adaption of the chromosome global shape geometry through the chromosome compaction without significantly disrupting the contact formation. On the other hand, our detailed analysis showed no merging point for the two cancer formation processes initialized from the ESC and the normal cell, implying that cancer progression is a complex process and may be associated with multiple pathways. Our results draw a complete molecular picture of cell development and cancer at the dynamical chromosome structural level, and help our understanding of the molecular mechanisms of cell fate decision making processes.
Lu, J.
Background: With the worldwide outbreak of COVID-19, an accurate model to predict how the coronavirus pandemic will evolve in individual countries becomes important and urgent. Our goal is to provide a prediction model to help policy makers in different countries address the epidemic outbreak and adjust the control policies to contain the spread of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) more effectively. Methods: Unlike the classic public health and virus propagation models, this new projection model takes both government intervention and public response into account to generate reliable projections of the outbreak 10 days to 2 weeks in advance. This method is an observation-based projection similar to the classic Moore's Law in microelectronics. Moore's Law is not based on any physical law and yet has anticipated the development of microelectronics for decades. This work is an empirical relation to describe the evolution from epidemic to pandemic situations in different countries. The country was selected as the observation unit because regulation and political decisions are made at the national level for numerous measures such as the implementation of social distancing, the quarantine of suspected cases, and the closing of borders to achieve territorial containment. Findings: This model has been successfully applied to predict the evolution of the pandemic situation in China. The model was then also validated with the South Korean data. With a reduction of cases calculated as a reduction coefficient of the increase rate of daily cases Rc = 2% per day, we observed a very efficient policy with strict systematic control in both China and South Korea. For the moment, Canada, the USA, and Australia may have difficulty limiting the fast evolution of the epidemic. With Rc < 0.5%, it is particularly important for the USA to consider escalating the control measures because the number of affected cases can reach more than one million very soon.
Interpretation: Due to differences in national discipline and historical culture, national policy may be implemented and observed with different efficiency. The starting point at which the government decided to apply total containment can also play a key role in the evolution of the pandemic situation. The model will allow each national government of the nations still affected by the pandemic to project the situation for the coming 10 to 14 days. This is very important for the deployment of national and international efforts to stop the pandemic situation. Funding: National Key R&D Program of China (Ministry of Science & Technology (MOST, China))
Yao, Y.; Zhu, J.; Li, W.; Pei, D.
The cell fate transition is a fundamental characteristic of living organisms. By introducing external perturbations, it is possible to artificially intervene in cell fate and trigger cell reprogramming. Revealing the general principle underlying the induced phenotypic reshaping of cell populations remains a central focus in the field of cell biology. In this study, we investigate the energetic and dynamic features of induced cell phenotypic transition from differentiated somatic state to pluripotent state by constructing a Boolean genetic network model. The simulation and experimental results highlight the critical role of genetic frustration in initiating cell fate transitions, although the two ending phenotypic states are typically featured by minimal frustration. In addition, the altered gene expression profiles exhibit a scale-free distribution, suggesting that there exist a small number of critical genes responsible for the cell fate transition. This study provides important insights into the dynamic principles governing effective cell reprogramming caused by artificial or exogenous interventions.
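To make the notion of frustration in a signed Boolean network more concrete, the sketch below uses one common definition: a regulatory edge is frustrated when the product of its sign and the two gene states is negative, and the network frustration is the fraction of such edges. The abstract does not state the authors' exact definition, so this definition, the function name, and the toy two-gene network are all assumptions for illustration.

```python
def network_frustration(states, edges):
    """Fraction of frustrated edges in a signed regulatory network.

    states: dict mapping gene name -> +1 (on) or -1 (off).
    edges: list of (source, target, sign) with sign +1 for activation
           and -1 for inhibition.
    An edge is counted as frustrated when sign * s_source * s_target < 0
    (an assumed, commonly used definition).
    """
    if not edges:
        return 0.0
    frustrated = sum(
        1 for source, target, sign in edges
        if sign * states[source] * states[target] < 0
    )
    return frustrated / len(edges)


# Hypothetical toy network: A activates B, B inhibits A.
toy_states = {"A": 1, "B": -1}
toy_edges = [("A", "B", 1), ("B", "A", -1)]
print(network_frustration(toy_states, toy_edges))
```

In this toy case the activation edge is frustrated (A is on but B is off) while the inhibition edge is satisfied, so half of the edges are frustrated.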
Chen, S.
In this work, we establish and evolve an artificial metabolic system in silico to shed light on how the metabolic mechanism emerged. This system is composed of two subsystems: the artificial genome subsystem (AGS) and the artificial metabolite subsystem (AMS). The whole system is designed to be capable of being autonomous: the dynamics of AGS is capable of situating itself to the dynamics of AMS to provide it with enzymes at the right time and in the right quantity; the dynamics of AMS is capable of implementing the metabolic function and harvesting energy so as to pay back the energy consumption of AGS. This kind of autonomous state requires an intricate structure of the AGS, so it is almost impossible to predetermine manually. With the help of an evolutionary computational method that has a hierarchical mutational structure, the artificial metabolic system with this kind of autonomous state eventually emerged in silico. We find that ATP and ADP molecules have an important role in making the state of the system autonomous. We also find that the emerged structure of AGS resembles existing biological structures in natural cells.
Lu, Y.; Liu, X.; Zhang, Z.
Assembly of a protein complex is very important to its biological function, which can be investigated by determining assembly/disassembly order of its protein subunits. Although static structures of many protein complexes are available in the protein data bank, their assembly/disassembly orders of subunits are largely unknown. In addition to experimental techniques for studying subcomplexes in the assembly/disassembly of a protein complex, computational methods can be used to predict the assembly/disassembly order. Since sampling is a nontrivial issue in simulating the assembly/disassembly process, coarse-grained simulations are more efficient than atomic simulations are. In this work, we developed computational protocols for predicting assembly/disassembly orders of protein complexes using coarse-grained simulations. The protocols were illustrated using two protein complexes, and the predicted assembly/disassembly orders are consistent with available experimental data.
Ghaemi, Z.; Nafiu, O.; Tajkhorshid, E.; Gruebele, M.; Hu, J.
Despite a vaccine, hepatitis B virus (HBV) remains a world-wide source of infections and deaths, and tackling the infection requires a multimodal approach against the virus. We develop a whole-cell computational platform combining spatial and kinetic models for the infection cycle of a virus host cell (hepatocyte) by HBV. We simulate a near-complete viral infection cycle with this whole-cell platform stochastically for 10 minutes of biological time, to predict viral infection and map out virus-host as well as virus-drug interactions. We find that with an established infection, decreasing the copy number of the viral envelope proteins can shift the dominant infection pathways from secreting the capsids from the cell to re-importing the capsids back to the nucleus, resulting in a higher copy number of the viral DNA referred to as covalently closed circular DNA (cccDNA). This scenario can mimic the consequence of drugs designed to manipulate viral gene expression (such as siRNAs). Viral capsid mutations that destabilize the capsids such that they disassemble at nuclear pore complexes result in an increase in cccDNA copy number. However, excessive destabilization leading to cytoplasmic disassembly does not increase the cccDNA copy number. Finally, our simulations can predict the best drug dosage and timing of its administration to reduce the cccDNA copy number, which is the hallmark of infection. Our adaptable computational platform can be utilized to study other viruses and more complex host-virus interactions, and to identify the most central viral pathways that can be targeted by drugs or a combination of them.
Li, X.; Li, H.; Yang, Z.; Zhang, Z.
Exploring the composition and evolutionary regularity of genome sequences and constructing phylogenetic relationships by alignment-free methods at the genome level are high-profile topics. Our previous research discovered the CG and TA independent selection laws that exist in genome sequences by analysis of the spectral features of 8-mer subsets of 920 eukaryote and prokaryote genomes. We found that the evolution state of genomes is determined by the intensity of the two independent selections and the degree of the mutual inhibition between them. In this study, the two independent selection patterns of 22 primate and 28 insect genome sequences were analyzed further. The two complete 8-mer motif sets containing the CG or TA dinucleotide and their relative frequency features are proposed. We found that the two 8-mer sets and their features are related directly to the sequence evolution of genomes. According to the relative frequency of the two 8-mer sets, phylogenetic trees were constructed respectively for the given primate and insect genomes. Through analysis and comparison, we found that our phylogenetic trees are more consistent with the known conclusions. The two kinds of phylogenetic relationships constructed by the CG 8-mer set and the TA 8-mer set are similar in insect genomes, but the phylogenetic relationship constructed by the CG 8-mer set reflects the evolution state of genomes in the current age and the phylogenetic relationship constructed by the TA 8-mer set reflects the evolution state of genomes in a slightly earlier period. We attribute this to the TA independent selection being repressed by the CG independent selection in the process of genome evolution. Our study provides a theoretical approach to construct more objective evolutionary relationships at the genome level.
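The 8-mer sets described above can be sketched, under assumptions, as simple k-mer counting followed by selecting the 8-mers that contain a given dinucleotide. The abstract does not define the authors' relative frequency feature precisely, so the version here (share of all counted 8-mer occurrences that contain the dinucleotide) and the function names are illustrative guesses rather than the published method.

```python
from collections import Counter


def kmer_counts(sequence, k=8):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    seq = sequence.upper()
    counts = Counter()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def dinucleotide_subset_frequency(counts, dinucleotide):
    """Share of counted k-mer occurrences whose k-mer contains the dinucleotide.

    This is an assumed stand-in for the 'relative frequency' feature.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for kmer, n in counts.items() if dinucleotide in kmer)
    return hits / total


# Hypothetical toy sequence; real genomes would be read from FASTA files.
counts = kmer_counts("ACGTACGTACAAAAAAAAAA")
print(dinucleotide_subset_frequency(counts, "CG"))
print(dinucleotide_subset_frequency(counts, "TA"))
```

Feature vectors of this kind, computed per genome, could then be compared with any distance measure to build a tree; that step is left out because the abstract does not specify it.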
Qin, Z.; Zhang, H.; Huang, J.; Gao, Q.; Tang, Y.; Wu, Y.
Natural products are important sources for drug development, and the precise prediction of their structures assembled by modular proteins is an area of great interest. In this study, we introduce DeepT2, an end-to-end, cost-effective, and accurate machine learning platform to accelerate the identification of type II polyketides (T2PKs), which represent a significant portion of the natural product world. Our algorithm is based on advanced natural language processing models and utilizes the core biosynthetic enzyme, chain length factor (CLF or KSβ), as computing inputs. The process involves sequence embedding, data labeling, classifier development, and novelty detection, which enable precise classification and prediction directly from KSβ without sequence alignments. Combined with metagenomics and metabolomics, we evaluated the ability of DeepT2 and found this model could easily detect and classify KSβ either as a single sequence or a mixture of bacterial genomes, and subsequently identify the corresponding T2PKs in a labeled categorized class or as novel. Our work highlights deep learning as a promising framework for genome mining and therefore provides a meaningful platform for discovering medically important natural products.
Zhu, Y.; Gu, J.; Qiu, Y.; Chen, S.
The real-world performance of vaccines against COVID-19 infections is critically important to counter the pandemics. We propose a varying coefficient stochastic epidemic model to estimate the vaccine protection rates based on the publicly available epidemiological and vaccination data. To tackle the challenges posed by the unobserved state variables, we develop a multi-step decentralized estimation procedure that uses different data segments to estimate different parameters. A B-spline structure is used to approximate the underlying infection rates and to facilitate model simulation in obtaining an objective function between the imputed and the simulation-based estimates of the latent state variables, leading to simulation-based estimation of the diagnosis rate using data in the pre-vaccine period and the vaccine effect parameters using data in the post-vaccine periods. And the time-varying infection, recovery and death rates are estimated by kernel regressions. We apply the proposed method to analyze the data in ten countries which collectively used 8 vaccines. The analysis reveals that the average protection rate of the full vaccination was at least 22% higher than that of the partial vaccination and was largely above the WHO recognized level of 50% before November 20, 2021, including the Delta variant dominated period. The protection rates for the booster vaccine in the Omicron period were also provided.
Kanazawa, T.; Yao, T.; Takeshita, S.; Hirai, T.; Suenaga, R.; Yamada, T. G.; Tokuoka, Y.; Yamagata, K.; Funahashi, A.
Selection of high-quality embryos is critical in assisted reproductive technology (ART), but it relies on visual assessment by experts, and the birth rate remains low. We previously developed a deep learning method to predict the birth of mouse embryos by quantifying the morphological features of cell nuclei. This method involves cell nuclear segmentation on fluorescence microscopy images, but fluorescence labeling of nuclei is not feasible in medical applications. Here, we developed FL2-Net, a nuclear segmentation method for time-series three-dimensional bright-field microscopy images of mouse embryos without fluorescence labeling. FL2-Net outperformed existing state-of-the-art segmentation methods. We predicted the birth potential of mouse embryos from the nuclear features quantified by bright-field microscopy image segmentation. Birth prediction accuracy of FL2-Net (81.63%) exceeded those of other methods and experts (55.32%). We expect that FL2-Net, which can quantify nuclear features of embryos non-invasively and with high accuracy, might be useful in ART.